# Local region description

DAM-3B-Video is a 3-billion-parameter vision-language model that generates fine-grained descriptions of user-specified regions in images and videos.

Tags: Safetensors, English

© 2025 AIbase